12/10/2019

Introduction and experimental design

  • I work with animal models of breast cancer and conducted an experiment where I treated tumor-bearing mice with the CDK4/6 inhibitor, palbociclib.
  • The dose-limiting toxicity of palbociclib is decreased white blood cell counts, so part of my experiment involved measuring the complete blood count and leukocyte differential of mice after treatment.

Project setup

  • First, I loaded the libraries, set my working directory, and sourced a helper script.
library(tidyverse)
library(rio)
library(broom)
library(visdat)
library(plotly)
setwd("~/Downloads/Gallanis_BIOF339_final")
source("gregtheme.R") # a custom ggplot theme to match my other plots
  • I integrated my R project files into my animal experiment folder for good housekeeping. (For this presentation all the files load from the downloads folder.)

Data import

  • Then, I imported the data, which I had previously transcribed into Excel.
CBC_raw <- import("Mouse_CBCs.csv")
trt <- import("Exp_2_groups.csv")

str(CBC_raw)
## 'data.frame':    22 obs. of  19 variables:
##  $ No           : chr  "Reference_Low" "Reference_High" "2200" "2193" ...
##  $ WBC_(10^9/L) : num  6 15 3.38 2.5 2.03 2.84 4.37 1.89 3.44 3.92 ...
##  $ LYM_(10^9/L) : num  3.4 7.44 2.5 1.76 1.47 2.24 2.87 1.54 2.71 3.28 ...
##  $ MON_(10^9/L) : num  0 0.6 0.09 0.22 0.06 0.1 0.25 0.04 0.04 0.12 ...
##  $ NEU_(10^9/L) : num  0.5 3.8 0.8 0.51 0.5 0.5 1.25 0.31 0.69 0.52 ...
##  $ LY_%         : num  57 93 73.8 70.6 72.4 78.7 65.7 81.5 78.8 83.6 ...
##  $ MO_%         : num  0 7 2.7 8.8 3 3.6 5.7 2.2 1.2 3.2 ...
##  $ NE_%         : num  8 48 23.5 20.5 24.6 17.7 28.6 16.3 20 13.2 ...
##  $ RBC_(10^12/L): num  7 12 7.17 6.74 7.56 7.67 7.75 8.49 7.82 7.04 ...
##  $ HGB_g/dL     : num  12.2 16.2 11.4 10.8 11.9 11.9 11.3 13.5 13.1 10.6 ...
##  $ HCT_%        : num  35 45 37.6 39 39.8 ...
##  $ MCV_fL       : int  45 55 53 58 53 53 50 55 53 60 ...
##  $ MCH_pg       : num  11.1 12.7 16 16 15.8 15.5 14.6 15.9 16.7 15 ...
##  $ MCHC_g/dL    : num  22.3 32 30.4 27.7 29.9 28.9 29.2 29 31.4 25.1 ...
##  $ RDWc_%       : num  NA NA 19.5 21.5 20.2 20.2 18.9 20.8 20.2 18.2 ...
##  $ PLT_(10^9/L) : int  200 450 424 474 483 392 403 389 561 480 ...
##  $ PCT_%        : num  NA NA 0.24 0.32 0.28 0.25 0.23 0.24 0.34 0.29 ...
##  $ MPV_fL       : num  NA NA 5.8 6.7 5.8 6.4 5.7 6.2 6 6 ...
##  $ PDWc_%       : num  NA NA 28.5 30.2 29.1 30.2 29.6 29.1 29.1 29.6 ...

str(trt)
## 'data.frame':    20 obs. of  3 variables:
##  $ No       : int  2193 2301 2325 2316 2320 2194 2200 2304 2326 2317 ...
##  $ Group    : chr  "A" "A" "A" "A" ...
##  $ Treatment: chr  "Palbociclib" "Palbociclib" "Palbociclib" "Palbociclib" ...

Data reformatting and joining

  • I adjusted the data formats, removed extraneous information such as the reference ranges, and merged the measurements with the treatment group annotations.
trt <- trt %>% mutate(
  No = as.character(No),
  Group = as.factor(Group),
  Treatment = as.factor(Treatment))

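CBC <- CBC_raw %>%
  slice(-c(1, 2)) %>%             # drop the two reference-range rows
  full_join(trt, by = "No") %>%   # add Group and Treatment for each mouse
  arrange(No) %>%
  select(No, Treatment, Group, everything())  # put the annotations first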

str(CBC)
## 'data.frame':    20 obs. of  21 variables:
##  $ No           : chr  "2193" "2194" "2195" "2200" ...
##  $ Treatment    : Factor w/ 2 levels "Palbociclib",..: 1 2 1 2 1 2 1 2 1 2 ...
##  $ Group        : Factor w/ 2 levels "A","B": 1 1 2 1 1 2 2 1 2 2 ...
##  $ WBC_(10^9/L) : num  2.5 2.03 1.46 3.38 2.84 2.25 2.89 4.37 2.12 2.97 ...
##  $ LYM_(10^9/L) : num  1.76 1.47 1.16 2.5 2.24 1.8 2.36 2.87 1.52 2.26 ...
##  $ MON_(10^9/L) : num  0.22 0.06 0.07 0.09 0.1 0.07 0.08 0.25 0.1 0.07 ...
##  $ NEU_(10^9/L) : num  0.51 0.5 0.22 0.8 0.5 0.38 0.45 1.25 0.51 0.63 ...
##  $ LY_%         : num  70.6 72.4 79.9 73.8 78.7 79.9 81.7 65.7 71.4 76.2 ...
##  $ MO_%         : num  8.8 3 4.8 2.7 3.6 3.3 2.7 5.7 4.7 2.4 ...
##  $ NE_%         : num  20.5 24.6 15.4 23.5 17.7 16.8 15.6 28.6 24 21.4 ...
##  $ RBC_(10^12/L): num  6.74 7.56 6.9 7.17 7.67 5.48 6.92 7.75 4.61 7.56 ...
##  $ HGB_g/dL     : num  10.8 11.9 10.4 11.4 11.9 8.2 10.2 11.3 7 11.8 ...
##  $ HCT_%        : num  39 39.8 38.6 37.6 41 ...
##  $ MCV_fL       : int  58 53 56 53 53 58 57 50 55 58 ...
##  $ MCH_pg       : num  16 15.8 15.1 16 15.5 14.9 14.7 14.6 15.2 15.7 ...
##  $ MCHC_g/dL    : num  27.7 29.9 27 30.4 28.9 25.9 26 29.2 27.6 26.9 ...
##  $ RDWc_%       : num  21.5 20.2 19.5 19.5 20.2 17.4 17.7 18.9 21 17.4 ...
##  $ PLT_(10^9/L) : int  474 483 367 424 392 538 398 403 325 563 ...
##  $ PCT_%        : num  0.32 0.28 0.23 0.24 0.25 0.41 0.24 0.23 0.2 0.36 ...
##  $ MPV_fL       : num  6.7 5.8 6.1 5.8 6.4 7.7 6.1 5.7 6.3 6.3 ...
##  $ PDWc_%       : num  30.2 29.1 29.1 28.5 30.2 31.2 30.7 29.6 30.2 29.6 ...
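
  • The joining step can be sanity-checked before the tables are merged. The sketch below is an illustration on a small toy subset, not the code used for the analysis: the names cbc_toy, trt_toy, unmatched_cbc, and unmatched_trt are hypothetical. It drops the reference rows by name rather than by position and uses anti_join() to list any mouse IDs that appear in only one table.

```r
library(dplyr)

# Toy stand-ins for the two imported tables (illustration only)
cbc_toy <- data.frame(
  No  = c("Reference_Low", "Reference_High", "2200", "2193", "2194"),
  WBC = c(6, 15, 3.38, 2.5, 2.03),
  stringsAsFactors = FALSE
)
trt_toy <- data.frame(
  No        = c(2193L, 2194L, 2200L),
  Treatment = c("Palbociclib", "Vehicle", "Vehicle"),
  stringsAsFactors = FALSE
)

# Drop the reference rows by name, so the step does not depend on row order
measurements <- cbc_toy %>% filter(!grepl("^Reference", No))

# The key columns must share a type: character in one table, integer in the other
trt_toy <- trt_toy %>% mutate(No = as.character(No))

# Rows with no partner in the other table; both results should be empty
unmatched_cbc <- anti_join(measurements, trt_toy, by = "No")
unmatched_trt <- anti_join(trt_toy, measurements, by = "No")

joined <- full_join(measurements, trt_toy, by = "No") %>% arrange(No)
```

  • If either anti_join() result has rows, full_join() would silently fill the missing side with NA, so it is worth looking at them first.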

vis_dat(CBC_raw)
vis_dat(CBC)
Left: before processing. Right: after processing and joining.

Taking a preliminary look

  • First, I plotted every measurement by treatment group to get an overview of the data.
liveplot <- CBC %>% gather(Measurement, Value, names(select_if(.,is.numeric)))  %>%
  ggplot() +
  # jitter horizontally only (height = 0) so the plotted values are not altered
  geom_jitter(aes(x = Treatment, y = Value, color = Treatment), width = 0.2, height = 0) +
  facet_wrap(~ Measurement, scales = "free") +
  xlab("Treatment") +
  ylab("Value") +
  ggtitle("Mouse CBC values by treatment")

ggplotly(liveplot, tooltip = c("x","y","label"))

Preliminary look
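
  • The plot relies on first reshaping the table from one column per measurement to one row per measurement. As a minimal sketch of that reshaping step, here is the same idea on a two-mouse toy table using pivot_longer(), the newer tidyr counterpart of gather(). The names cbc_toy and long are hypothetical, and the column names are shortened for readability.

```r
library(dplyr)
library(tidyr)

# Toy wide table: one row per mouse, one column per measurement
cbc_toy <- data.frame(
  No        = c("2193", "2194"),
  Treatment = factor(c("Palbociclib", "Vehicle")),
  WBC       = c(2.5, 2.03),
  PLT       = c(474, 483)
)

# Stack every numeric column into Measurement/Value pairs
long <- cbc_toy %>%
  pivot_longer(cols = where(is.numeric),
               names_to = "Measurement",
               values_to = "Value")
```

  • Each mouse now contributes one row per measurement, which is the shape facet_wrap(~ Measurement) needs to draw one panel per CBC parameter.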

Summary statistics and analyses

  • Next, I calculated the mean and standard deviation of each CBC parameter, grouped by treatment.
  • I also ran a two-sided Welch's t test (the default for t.test()) on each CBC parameter to compare the means of the palbociclib-treated and vehicle-treated samples.
CBC_summary <- CBC %>%
  gather(Measurement, Value, names(select_if(.,is.numeric))) %>% 
  group_by(Treatment, Measurement) %>% 
  summarize_at("Value",list(mean = mean, sd = sd)) %>% 
  arrange(Measurement)

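CBC_ttest <- CBC %>% 
  select_if(is.numeric) %>%
  # Welch two-sample t test for each numeric column; keep only the p-value
  sapply(function(i) t.test(i ~ CBC$Treatment, alternative = "two.sided")$p.value) %>% 
  # enframe() turns the named vector into a two-column table
  # (broom's tidy() method for plain vectors is deprecated)
  enframe(name = "Measurement", value = "two_sided_p_val") %>%
  data.frame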

head(CBC_summary)
## # A tibble: 6 x 4
## # Groups:   Treatment [2]
##   Treatment   Measurement  mean    sd
##   <fct>       <chr>       <dbl> <dbl>
## 1 Palbociclib HCT_%        38.4  7.13
## 2 Vehicle     HCT_%        40.2  3.80
## 3 Palbociclib HGB_g/dL     10.5  1.97
## 4 Vehicle     HGB_g/dL     11.2  1.27
## 5 Palbociclib LY_%         77.5  5.95
## 6 Vehicle     LY_%         75.4  4.26
CBC_ttest
##      Measurement two_sided_p_val
## 1   WBC_(10^9/L)     0.428670423
## 2   LYM_(10^9/L)     0.438061167
## 3   MON_(10^9/L)     0.743560862
## 4   NEU_(10^9/L)     0.405668715
## 5           LY_%     0.379631407
## 6           MO_%     0.118302441
## 7           NE_%     0.128814347
## 8  RBC_(10^12/L)     0.256954686
## 9       HGB_g/dL     0.326749280
## 10         HCT_%     0.490640877
## 11        MCV_fL     0.304545869
## 12        MCH_pg     0.703939032
## 13     MCHC_g/dL     0.491687024
## 14        RDWc_%     0.085109848
## 15  PLT_(10^9/L)     0.001837129
## 16         PCT_%     0.008826005
## 17        MPV_fL     0.520929762
## 18        PDWc_%     0.113294160
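
  • The same two steps can also be written against a single long-format table. The sketch below is one possible alternative, not the code that produced the output above. It runs on made-up toy values, and the names toy, long, toy_summary, and toy_ttest are hypothetical.

```r
library(dplyr)
library(tidyr)

# Toy data: two measurements, four mice per group (values are made up)
toy <- data.frame(
  Treatment = factor(rep(c("Palbociclib", "Vehicle"), each = 4)),
  WBC = c(2.5, 2.8, 2.1, 2.9, 2.0, 3.4, 4.4, 3.0),
  PLT = c(474, 367, 392, 398, 483, 538, 563, 520)
)

long <- toy %>%
  pivot_longer(cols = where(is.numeric),
               names_to = "Measurement",
               values_to = "Value")

# Mean and SD for each treatment within each measurement
toy_summary <- long %>%
  group_by(Treatment, Measurement) %>%
  summarise(mean = mean(Value), sd = sd(Value), .groups = "drop") %>%
  arrange(Measurement)

# One Welch t test per measurement (t.test() defaults to var.equal = FALSE)
toy_ttest <- long %>%
  group_by(Measurement) %>%
  summarise(two_sided_p_val = t.test(Value ~ Treatment)$p.value,
            .groups = "drop")
```

  • Working from the long table keeps the measurement name and its p-value in the same row from the start, so no renaming step is needed afterwards.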

Adding annotations based on t test results

  • I created a small data frame containing the annotations and coordinates for drawing comparison lines
annotations <- data.frame(Measurement = c("PLT_(10^9/L)","PCT_%"),
                          x1 = c(1,1), x2 = c(2,2), 
                          y1 = c(625,0.6), y2 = c(650,0.64), 
                          xlab = c(1.5,1.5), ylab = c(670,0.67), 
                          lab = c("**","**"))
##    Measurement x1 x2    y1     y2 xlab   ylab lab
## 1 PLT_(10^9/L)  1  2 625.0 650.00  1.5 670.00  **
## 2        PCT_%  1  2   0.6   0.64  1.5   0.67  **
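
  • The coordinates above were chosen by hand. As a sketch of an alternative, they could be derived from the data so the brackets move with the points. The helper bracket_coords() and the 10%, 15%, and 20% offsets below are assumptions for illustration, and the toy values are made up. The error bars (mean + SD) may reach higher than the tallest point, so the offsets might need adjusting for real data.

```r
library(dplyr)

# Toy long-format data (values are made up)
long <- data.frame(
  Measurement = rep(c("PLT_(10^9/L)", "PCT_%"), each = 4),
  Value = c(474, 483, 367, 563, 0.32, 0.28, 0.23, 0.36)
)

# Hypothetical helper: place a bracket a fixed fraction above the tallest point
bracket_coords <- function(data, measurements, label = "**") {
  data %>%
    filter(Measurement %in% measurements) %>%
    group_by(Measurement) %>%
    summarise(top = max(Value), .groups = "drop") %>%
    mutate(x1 = 1, x2 = 2,
           y1 = top * 1.10,    # bottom of the two vertical ticks
           y2 = top * 1.15,    # height of the horizontal bar
           xlab = 1.5,
           ylab = top * 1.20,  # the label sits above the bar
           lab = label) %>%
    select(-top)
}

annotations_toy <- bracket_coords(long, c("PLT_(10^9/L)", "PCT_%"))
```

  • The result has the same columns as the hand-written annotations data frame, so it could be passed to the same geom_text() and geom_segment() layers.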

Putting it all together

finalplot <- CBC %>% gather(Measurement, Value, names(select_if(.,is.numeric)))  %>%
  ggplot() +
  # jitter horizontally only (height = 0) so the plotted values are not altered
  geom_jitter(aes(x = Treatment, y = Value, color = Treatment), width = 0.2, height = 0) +
  # error bars spanning mean ± SD
  geom_errorbar(aes(x = Treatment, ymax = mean+sd, ymin = mean-sd, color = Treatment), 
                width = 0.1, data = CBC_summary, inherit.aes = FALSE, show.legend = FALSE) +
  # a zero-height error bar draws a horizontal line at the mean
  geom_errorbar(aes(x = Treatment, ymax = mean, ymin = mean, color = Treatment), 
                width = 0.2, data = CBC_summary, inherit.aes = FALSE, show.legend = FALSE) +
  geom_text(data = annotations, aes(x = xlab, y = ylab, label = lab)) +
  geom_segment(data = annotations, aes(x = x1, xend = x1, y = y1, yend = y2)) +
  geom_segment(data = annotations, aes(x = x2, xend = x2, y = y1, yend = y2)) +
  geom_segment(data = annotations, aes(x = x1, xend = x2, y = y2, yend = y2)) +
  scale_color_manual(values=c("#EB4478","#000000")) +
  scale_y_continuous(expand = c(.1,.1)) +
  facet_wrap(~ Measurement, scales = "free") +
  xlab("Treatment") +
  ylab("Values (mean ± SD)") +
  ggtitle("Mouse CBC values by treatment") +
  gregtheme()

finalplot


Conclusions

  • Platelet count (concentration in blood) and plateletcrit (percent of blood volume occupied by platelets) were significantly decreased in the palbociclib-treated mice compared to the vehicle-treated mice (p = 0.002 and p = 0.009, respectively).
    • This is an on-target effect.
  • No other parameter differed significantly between the groups. Neutrophil count was lower in the palbociclib-treated group, but the difference was not significant (p = 0.41).